53 research outputs found
Multiple Instance Learning Networks for Fine-Grained Sentiment Analysis
We consider the task of fine-grained sentiment analysis from the perspective
of multiple instance learning (MIL). Our neural model is trained on document
sentiment labels, and learns to predict the sentiment of text segments, i.e.
sentences or elementary discourse units (EDUs), without segment-level
supervision. We introduce an attention-based polarity scoring method for
identifying positive and negative text snippets and a new dataset which we call
SPOT (as shorthand for Segment-level POlariTy annotations) for evaluating
MIL-style sentiment models like ours. Experimental results demonstrate superior
performance against multiple baselines, whereas a judgement elicitation study
shows that EDU-level opinion extraction produces more informative summaries
than sentence-based alternatives.

Comment: Final published version. Please cite using appropriate date (2018).
Link to journal:
http://www.transacl.org/ojs/index.php/tacl/article/view/1225/27
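The attention-based polarity scoring described above can be sketched in a few lines: each segment receives a polarity in [-1, 1] and an attention weight, and the document score is their weighted sum, so supervision on document labels alone still yields segment-level polarities. This is a minimal sketch, not the paper's exact parameterisation; the weight vectors `W_p` and `W_a` are hypothetical stand-ins for learned parameters.

```python
import numpy as np

def doc_polarity(segment_vecs, W_p, W_a):
    """MIL-style sentiment scorer (sketch, hypothetical weights W_p, W_a).

    segment_vecs: (n, d) encodings of n segments (sentences or EDUs).
    Returns the document score plus per-segment polarities and attention,
    which is what lets a document-supervised model expose segment opinions.
    """
    polarity = np.tanh(segment_vecs @ W_p)   # (n,) segment polarity in [-1, 1]
    logits = segment_vecs @ W_a              # (n,) attention logits
    attn = np.exp(logits - logits.max())
    attn /= attn.sum()                       # softmax over segments
    return float(attn @ polarity), polarity, attn
```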
Learning Structured Text Representations
In this paper, we focus on learning structure-aware document representations
from data without recourse to a discourse parser or additional annotations.
Drawing inspiration from recent efforts to empower neural networks with a
structural bias, we propose a model that can encode a document while
automatically inducing rich structural dependencies. Specifically, we embed a
differentiable non-projective parsing algorithm into a neural model and use
attention mechanisms to incorporate the structural biases. Experimental
evaluation across different tasks and datasets shows that the proposed model
achieves state-of-the-art results on document modeling tasks while inducing
intermediate structures which are both interpretable and meaningful.

Comment: change to one-based indexing; published in Transactions of the
Association for Computational Linguistics (TACL),
https://transacl.org/ojs/index.php/tacl/article/view/1185/28
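The differentiable non-projective parsing step can be realised with the matrix-tree theorem, which turns arc scores into marginal probabilities of dependency arcs; those marginals can then serve as structural attention weights. A minimal numpy sketch under that assumption (the arc-weight matrix `w` is a hypothetical input, not the paper's exact parameterisation):

```python
import numpy as np

def tree_marginals(w):
    """Arc marginals over non-projective spanning trees (matrix-tree theorem).

    w: (n+1, n+1) non-negative arc weights, w[h, m] scoring head h -> modifier m,
    with node 0 an artificial root. Returns mu with mu[h, m] = P(arc h -> m)
    under the distribution proportional to the product of a tree's arc weights.
    """
    n = w.shape[0] - 1
    # Directed-graph Laplacian over the non-root nodes; det(L) is the partition function.
    L = np.zeros((n, n))
    for m in range(1, n + 1):
        L[m - 1, m - 1] = sum(w[h, m] for h in range(n + 1) if h != m)
        for h in range(1, n + 1):
            if h != m:
                L[h - 1, m - 1] = -w[h, m]
    Linv = np.linalg.inv(L)
    # Marginals are weights times gradients of log det(L).
    mu = np.zeros_like(w)
    for m in range(1, n + 1):
        mu[0, m] = w[0, m] * Linv[m - 1, m - 1]
        for h in range(1, n + 1):
            if h != m:
                mu[h, m] = w[h, m] * (Linv[m - 1, m - 1] - Linv[m - 1, h - 1])
    return mu
```

Because every step is differentiable (a determinant/inverse of the Laplacian), the marginals can be used inside a neural model and trained end-to-end, which is the structural bias the abstract refers to.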
The Automatic Interpretation of Nominalizations
This paper discusses the interpretation of nominalizations in domain-independent wide-coverage text. We present a statistical model which interprets nominalizations based on the co-occurrence of verb-argument tuples in a large balanced corpus. We propose an algorithm which treats the interpretation task as a disambiguation problem and achieves a performance of approximately 80% by combining partial parsing, smoothing techniques and domain-independent taxonomic information (e.g., WordNet).
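The disambiguation view of nominalization interpretation can be sketched as picking the verb-argument relation with the highest smoothed corpus frequency, e.g. deciding whether "car lover" means love(obj=car) or love(subj=car). The toy counts and Laplace smoothing below are illustrative stand-ins for the paper's corpus statistics and smoothing techniques:

```python
def interpret(verb, modifier, counts, relations=("subj", "obj"), alpha=0.5):
    """Choose the relation r maximising smoothed P(r | verb, modifier).

    counts: dict mapping (verb, argument, relation) -> corpus frequency.
    alpha: Laplace smoothing constant (stand-in for real smoothing schemes).
    Returns the best relation and the full probability distribution.
    """
    scores = {r: counts.get((verb, modifier, r), 0) + alpha for r in relations}
    total = sum(scores.values())
    probs = {r: s / total for r, s in scores.items()}
    return max(probs, key=probs.get), probs
```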
Acquiring Lexical Generalizations from Corpora: A Case Study for Diathesis Alternations
This paper examines the extent to which verb diathesis alternations are empirically attested in corpus data. We automatically acquire alternating verbs from large balanced corpora by using partial-parsing methods and taxonomic information, and discuss how corpus data can be used to quantify linguistic generalizations. We estimate the productivity of an alternation and the typicality of its members using type and token frequencies.
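Type and token frequencies of the kind the abstract mentions come straight from corpus counts. The sketch below is a hypothetical helper, not the paper's exact measures: it treats the number of distinct alternating verbs as a productivity estimate and each verb's share of the alternation's tokens as its typicality:

```python
from collections import Counter

def alternation_stats(corpus_verbs, alternating):
    """Type/token statistics for a diathesis alternation (sketch).

    corpus_verbs: iterable of verb tokens from a corpus.
    alternating: set of verbs attested in both frames of the alternation.
    Returns (type frequency, per-verb typicality from token shares).
    """
    tokens = Counter(v for v in corpus_verbs if v in alternating)
    type_freq = len(tokens)                   # distinct alternating verbs
    token_total = sum(tokens.values())
    typicality = {v: c / token_total for v, c in tokens.items()}
    return type_freq, typicality
```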
A Corpus-based Account of Regular Polysemy: The Case of Context-sensitive Adjectives
In this paper we investigate polysemous adjectives whose meaning varies depending on the nouns they modify (e.g., fast). We acquire the meanings of these adjectives from a large corpus and propose a probabilistic model which provides a ranking on the set of possible interpretations. We identify lexical semantic information automatically by exploiting the consistent correspondences between surface syntactic cues and lexical meaning. We evaluate our results against paraphrase judgments elicited experimentally from humans and show that the model's ranking of meanings correlates reliably with human intuitions: meanings that are found highly probable by the model are also rated as plausible by the subjects.
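The probabilistic ranking of interpretations can be sketched as ordering candidate verbal paraphrases of an adjective-noun pair (e.g. "fast plane" as a plane that flies vs. lands quickly) by a simple conditional probability estimated from co-occurrence counts. The toy counts below are illustrative, not the paper's model:

```python
def rank_meanings(noun, verb_counts):
    """Rank candidate verbal paraphrases of an adjective-noun pair (sketch).

    verb_counts: dict noun -> {candidate verb: co-occurrence count}.
    Returns (verb, P(verb | noun)) pairs, most probable interpretation first.
    """
    candidates = verb_counts.get(noun, {})
    total = sum(candidates.values()) or 1     # avoid division by zero
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    return [(v, c / total) for v, c in ranked]
```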
Semantic Graph Parsing with Recurrent Neural Network DAG Grammars
Semantic parses are directed acyclic graphs (DAGs), so semantic parsing
should be modeled as graph prediction. But predicting graphs presents difficult
technical challenges, so it is simpler and more common to predict the
linearized graphs found in semantic parsing datasets using well-understood
sequence models. The cost of this simplicity is that the predicted strings may
not be well-formed graphs. We present recurrent neural network DAG grammars, a
graph-aware sequence model that ensures only well-formed graphs while
sidestepping many difficulties in graph prediction. We test our model on the
Parallel Meaning Bank---a multilingual semantic graphbank. Our approach yields
competitive results in English and establishes the first results for German,
Italian and Dutch.

Comment: 9 pages, to appear in EMNLP 2019
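The core idea, constraining a sequence model so that every output string decodes to a well-formed graph, can be sketched with an action mask. In the toy decoder below (a sketch of the idea, not the paper's RNN DAG grammar; random choices stand in for a trained model's scores), only validity-preserving actions are ever offered, and edges always point to an already-emitted node, ruling out cycles by construction:

```python
import random

def generate_dag(max_nodes=5, seed=0):
    """Grammar-constrained toy decoder: every sampled sequence is a valid DAG.

    Actions: NODE emits a new node, (EDGE, t) attaches the newest node to an
    earlier node t, EOS stops. The mask below is what guarantees well-formedness.
    """
    rng = random.Random(seed)
    nodes, edges, seq = 0, [], []
    for _ in range(50):                        # step cap keeps the sketch finite
        actions = []
        if nodes < max_nodes:
            actions.append(("NODE", None))
        if nodes >= 2:                         # an edge needs an earlier target
            actions.extend(("EDGE", t) for t in range(nodes - 1))
        if nodes >= 1:                         # may only stop with a non-empty graph
            actions.append(("EOS", None))
        act, arg = rng.choice(actions)         # a trained model would score these
        seq.append((act, arg))
        if act == "NODE":
            nodes += 1
        elif act == "EDGE":
            edges.append((nodes - 1, arg))     # source is the newest node
        else:
            break
    return nodes, edges, seq
```

Because edges can only target earlier nodes, acyclicity holds for any action sequence the mask permits, which is the sense in which the model "ensures only well-formed graphs".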
- …